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We introduce a variant of the BarndorfT-Nielsen and Shephard stochastic volatility model 
Tj ■ where the non Gaussian Ornstein-Uhlenbeck process describes some measure of trading 

intensity like trading volume or number of trades instead of unobservable instantaneous 
variance. We develop an explicit estimator based on martingale estimating functions in a 
CO ' bivariate model that is not a diffusion, but admits jumps. It is assumed that both the 

quantities are observed on a discrete grid of fixed width, and the observation horizon tends 
to infinity. We show that the estimator is consistent and asymptotically normal and give 
explicit expressions of the asymptotic covariance matrix. Our method is illustrated by a 
finite sample experiment and a statistical analysis on the International Business Machines 
Corporation (IBM) stock from the New York Stock Exchange (NYSE) and the Microsoft 
Ch ' Corporation (MSFT) stock from Nasdaq during a history of five years. 
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^ ■ 1 Introduction 

• ■ In |BNS01| Barndorff-Nielsen and Shephard introduced a class of stochastic volatility models in 

L«. , continuous time, where the instantaneous variance follows an Ornstein-Uhlenbeck type process 

^Q ■ driven by an increasing Levy process. BNS-models, as we will call them from now on, allow 

f~^ , flexible modelling, capture many stylized facts of financial time series, and yet are of great ana- 

lytical tractability. Those models have been studied from various points of view in mathematical 
finance and related fields. Unfortunately, it seems that statistical estimation of the model is 
p\. • the most difficult problem, and most of the work in that area is focused on computationally in- 

JH , tensive methods. In [HP 07] an explicit estimator based on martingale estimating functions was 

developed under the assumptions that returns and volatility are observed. That paper contains 
also further references on ENS models, martingale estimating functions, estimating discretely 
observed diffusions models, etc. The literature on estimation for discretely observed diffusions is 
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vast, a few references are |Uch041 lKP02l IJacOll iKesOOl lK599l IBS95| . In particular, the martin- 
gale estimating function approach is used, developed and studied for example in |S099| . In the 
diffusion setting the major difficulty is that the transition probabilities are not known and are 
difficult to compute. In contrast to that, the characteristic function of the transition probability 
is known in closed form for many BNS models and the transition probability can be computed 
with Fourier methods with high precision. 

In practice volatility is not observed, but many researchers, including, for example, |Kar87[ 
IGT92|, IJL94| have established a connection between volatility and different measures of trading 
intensity, such as traded volume or number of trades. In particular, [Lin07j gives a first applica- 
tion of this approach to BNS models. We take up this idea and combine it with the martingale 
estimating function approach. Measures of trading intensity contain much information about 
the volatility. We identify the volatility with a multiple of some measure of trading intensity, in 
this paper the daily traded volume. In doing so, our bivariate time series is given by the loga- 
rithmic returns and trading volume which are both observable quantities. We explore the joint 
distribution of logarithmic returns X and the instantaneous trading volume/number of trades r. 
The joint conditional moment-generating function of (X, r) is known in closed form and thus we 
obtain closed form expressions for the joint conditional moments up to any desired order. This 
yields a sequence of martingale differences and the martingale estimating function approach is 
used. We employ then the large sample properties, in particular the strong law of large numbers 
for martingales and the martingale central limit theorem. In this way we do not need ergodicity, 
mixing conditions, etc. 

The contributions of the present paper are as follows: first we develop a simple and explicit 
estimator for BNS models using a martingale estimating function approach and identifying the 
volatility with a multiple of trading volume. Secondly, we give proofs of its consistency and 
asymptotic normality. In doing so we compute explicitly the asymptotic covariance matrix. 
Thirdly, we include numerical illustrations and apply our method on real data. 

Since in this analysis we assume that the discrete time variance process Vi is proportional to 
the trading volume/number of trades r^ , we are able to directly model the stochastic volatility 
in asset price dynamics. Due to the analytical tractability of BNS models, we can work with the 
exact dynamics for discrete observations of the continuous time model. We want to stress that 
our approach leads to simple and explicit formulas for the estimator and its asymptotic covari- 
ance matrix, and no simulation or other computer intensive methods are required. Simulations 
are only used to illustrate the finite sample performance in numerical experiments. Finally, we 
apply the method to real data and do a statistical analysis on the International Business Ma- 
chines Corporation (IBM) stock from the New York Stock Exchange (NYSE) and the Microsoft 
Corporation (MSFT) stock from Nasdaq during a volatile history of five years. 

The remainder of the paper is organized as follows: in section ini we describe the class of BNS 
models in continuous time and present two concrete examples, the F— OU and IG-OU model. In 
section 12.21 we introduce the quantities observed in discrete time that are used for estimation. 
In section [3] we present the estimating equations, their explicit solution which is our estimator 
and its consistency and asymptotic normality are proven. In section |4] numerical illustrations are 
presented. In section [5] we apply our results on daily data on the IBM stock from NYSE and the 
MSFT stock from Nasdaq. 



2 The model 

2.1 The continuous time model 

2.1.1 The general setting 

As in Barndorff-Nielsen and Shepard |BNS01j . we assume that the price process of an asset S 
is defined on some filtered probability space {fl,J^, {Tt)t>o,P) and is given by St = S'oexp(Xt) 
with 5*0 > a constant. The process of logarithmic returns X and the instantaneous trading 
volume/number of trades process r satisfy 



dX{t) = (^ + pT{t-))dt + a^T{t-)dWe{t) + pdZx{t), X{Q) = 0. (f ) 

and 

dT{t) = -\T{t-)dt + dZx{t), r(0) = To, (2) 

where the parameters fi,P,p,<T and A are real constants with X,a > 0. The process W^ is a 
standard Brownian motion, the process Z is an increasing Levy process, and we define Z\{t) = 
Z{Xt) for notational simplicity. Adopting the terminology introduced by Barndorff-Nielsen and 
Shepard, we will refer to Z as the background driving Levy process (BDLP). The Brownian 
motion W and the BDLP Z are independent and {!Ft) is assumed to be the usual augmentation 
of the filtration generated by the pair {W, Z\). The random variable tq has a self-decomposable 
distribution corresponding to the BDLP such that the process r is strictly stationary and 

E{t^\ = C, Var[To] = ^. (3) 

For our analysis we will assume that the instantaneous variance process V^ is a constant time the 
trading volume/number of trades r. That is, 

dV(t)^(j^ -dTit), (4) 

with cr > 0. 

Remark 1 Equation ^ implies that the instantaneous variance of log returns is a constant 
multiple of the trading volume/number of trades, and trading volume/number of trades is modelled 
as an OU-type process. 

To shorten the notation we introduce the parameter vector 

6* = (iy,a,A,^,/3,cr,p)^, (5) 

and the bivariate process 

X = (X,r). (6) 

If the distribution of tq is from a particular class D then X is called a BNS-D0U(6') model. 
The process {Xt,Tt)t>o is clearly Markovian. 

2.1.2 The F-OU model 

The F-OU model is obtained by constructing the BNS-model with stationary gamma distribution. 
To ^ F(i^, a), where the parameters are v > Q and a > 0. The corresponding background driving 
Levy process Z is a compound Poisson processes with intensity v and jumps from the exponential 
distribution with parameter a. Consequently both processes Z and r have a finite number of 
jumps in any finite time interval. 



For the F-OU model it is more convenient to work with the parameters v and a. The 
connection to the generic parameters used in our general development is given by 

As the gamma distribution admits exponential moments we have integer moments of all orders 
and our Assumption [T] below is satisfied. 

2.1.3 The IG-OU model 

The IG-OU model is obtained by constructing the BNS-modcl with stationary inverse Gaussian 
distribution, tq ~ ((5,7), with parameters (5 > and 7 > 0. 

The corresponding background driving Levy process is the sum of an IG((5/2, 7) process and 
an independent compound Poisson process with intensity (^7/2 and jumps from an F(l/2,7^/2) 
distribution. Gonsequently both processes Z and r have infinitely many jumps in any finite time 
interval. 

For the IG-OU model it is more convenient to work with the parameters 8 and 7. The 
connection to the generic parameters used in our general development is given by 

As the inverse Gaussian distribution admits exponential moments we have integer moments of 
all orders and our Assumption [T] below is satisfied. 

2.2 Discrete observations 

The following description is rather analogous to |HP07[ Section 2.2]. The only (but important) 
exception is the introduction of the parameter a in p2p and (j2ip . We observe returns and the 
trading volume/number of trades process on a discrete grid of points in time 

= to < <i < . . . < i„, (9) 

which relates trading volume/number of trades and the instantaneous variance of log returns. 
This implies 

riti) = r(f,_i)e-^(*--''-^) + / " e~^^'^~'^ dZ,^{,). (10) 

Using 

r, :=T(t,), [/.:=/' e-^(**-^)dZA(s) (11) 

we have that {Ui)i>\ is a sequence of independent random variables, and it is independent of tq. 
If the grid is equidistant, then {Ui)i->\ are iid. Observing the returns X on the grid we have 

X{t,) - XiU^i) = ^i{U - i^-i) + l3{YiU) - y(f,-i)) 

(12) 



'ti-i 
where 



+ a / ^Tis-)dW{s) + piZxiU) ~ Zx{U^i)), 
Jti-i 

Y{t) = f T{s-)ds (13) 

^0 



is the integrated trading volume/number of trades process. This suggests introducing the discrete 
time quantities 

X, = X{U)-X{U^i), Y,^YiU)^Y{U^i), Z, ^ ZxiU) - ZxiU-i) (14) 

and 

1 



W,^-= y/T{s-)dW{s). (15) 



/Y,, 



Furthermore, it is also convenient to introduce the discrete quantity 

S, = \i.Z,- Ui). (16) 

It is not difficult to see (conditioning!) that (Wi)i>i is an iid A^(0, 1) sequence independent from 
all other discrete quantities. Wc note also that (C/i, Zi)i>i is a bivariate iid sequence, but Ui and 
Zi are obviously dependent. 

From now on, for notational simplicity, we consider the equidistant grid with 

tk = A:A, (17) 

where A > is fixed. This implies 

n = 7T.-1 + U, (18) 

and 

y, = eT,_i + S,, (19) 

where 

7 = e-^^, ^=^- (20) 

Furthermore, 

X, = fiA + l3Y,+a^,W,+pZ,. (21) 

The sequence {Xi,Ti)i>Q is clearly Markovian. From now on we assume all moments of the 
stationary distribution of tq exist. 

Assumption 1 

E[t^] < oo Vn e N. (22) 

In the estimating context we assume all moments are finite with respect to all probability mea- 
sures Pd,6 d O under consideration, where Q is the parameter space. 

No other assumptions are made, and all conditions required for consistency and asymptotic 
normality of our estimator will be proven rigorously from that assumption. 

Proposition 1 We have for all n e N that 

E[Z^] < oo, E[U^] < oo, E[S^] < oo, (23) 

and 

^[Yi"] < oo, E[W^] < oo, EiX"^] < oo. (24) 

Consequently the expectation of any (multivariate) polynomial in Zi,Ui, Si, \/Y\, W\ , X\ exists 
under Pg . 

Proof: The proof is given in [HPO?) Proposition 1] . 

Let us remark that, by the stationarity, the above result holds also for Zi,Ui,Si, y^, Wi,Xi 
instead of Zi,Ui,Si, \/Yi, Wi,Xi, where i e N is arbitrary. 



3 A theoretical framework of the estimation procedure 

3.1 The estimating equations and their exphcit solution 

The reader familiar with |HP07| wiU notice that the following developments are quite similar to 
the paper mentioned, the main (but important) difference is an additional estimating equation 
for the new parameter a. 

For estimation purposes we consider a probability space on which a parameterized family of 
probability measures is given: 

{n,T,{Pe:e^Q]), (25) 

where 6 = {0 G R^ : 6*1 > 0, 6*2 > 0, 6*^ > 0,9^ > 0}. The data is generated under the true 
probability measure Pg^ with some ^o G 0. The expectation with respect to Pg is denoted by 
Eg[] and with respect to P^o simply by E[.]. 

We assume there is a process X that is BNS-D0U(6') under Pg. We want to find an estimator 
for do using observations Xi, . . . , X„, ri, . . . , r„. We are interested in asymptotics as n ^ oo. To 
that purpose let us consider the following martingale estimating functions: 



Gl{e)^Yrk=i[rkTu-i- P{ru-i,e)\, 

Gi{e) = Yrk^,[Xk- .fir.^ue)], 
Gl{e) = Trk^,[XkTu-f{Tk-uO)\, 






= Ee[Ti\TQ = i] 

= Eg[TlTo\To = l] 

^Ee[Tl\To = L] 

= Eg[Xi\To = L] 
= Eg[XiTQ\To = l] 
= Eg[XiTi\To = l] 

= Ee[Xf\To = i] 



(26) 



Lemma 1 We have the explicit expressions 






P)v 



= 7t+(l-7)C 
= 7^2 + (1 - 7)Ci 

= 6V + 27(l-7)a+(l-7)'C' + (l 
= f3et + A/1 + (3 AC - (3eC + AApC 
= /3et2 + (A/, + (3AC - /3eC + AApO^ 

= /3e7t2 + A/X7i + T]X{AI3€V + e(2 + AXv)p) + C(/3t(e(l - 27) + A7) + AA(/L(e + "fup)) 
= A^T]u0^ + e'^T]ul3^ - 2Aer]vl3^ + ei(e(t - 2C) + 2AQ)P^ + 4A?7p/3 - 4er?/o/3 + 2Ae/zi/3 
+ 2AVC/3 - 2AeK/3 + A^/i^ + A'^r^X^vp^ + ea^{i - C) + Aa^^ 

2/32 A77 - 4/32 oy 



A (2/3?7i/pA2 + 2^pCA^ + 2'qp^A - 2/3e7/j/pA + 2/3eptCA) 
A2 



A 



(27) 



Proof: The formulas are special cases of the general moment calculations given in |HP07|, Ap- 
pendix A]. 



D 



The estimator 0„ is obtained by solving the estimating equation Gn{0) = and it turns out 
that this equation has a simple explicit solution. 



Proposition 2 The estimating equation Gn{dn) = admits for every n > 2 on the event 

Cn - {C - ^nVl > 0, W^ - (vlf > 0} (28) 

a unique solution 0„ — (i^„, a„, A„, /i„, /3„, cr„, p„) that is given by 

— 1-1- 

^"" -1+7?. 

A„ = -log(7„)/A; 

e-n = (1 -7n)/K; 

/Q VSn '^n^nJ 

Pn^{- [in^ni-Kf + enA„(?7„ + K\)2 - vl) + W^) - £,1^^ + ^n) I i'^^nVnK)] 
M„ - ( - AA„p„Cn - /3n(AC« + en(-Cn + W^)) + ^n) I ^\ 



CT„ = yan/b„ 



4/3„(-A+e„)»7„A„p„+/3g(-2AT)„+e„(>)„(2+£„A„)+e„A„((tj;^)^-tj;^))) + A„(-2A»)„A„p^-(C^)^+C^) . 



6„ = AC„ + e„(-Cn + w,\); 
w/iere 



W — „ Z^ '''* Sn — n Z^ TiTi-1 ^n — n ^ ^ 



i=l 



11/ It, IV II, 



(29) 
(30) 
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i = ^En-i ^^ = iE^ti (31) 






Proof: The first three equations G^„{9) = 0, for j — 1,2,3 contain only the unknowns CjVt^ 
and are easily solved. In fact we get a familiar estimator for the first two moments and the 
autocorrelation coefficient of an AR(1) process. The last four equations G'^(^) = 0, for j = 
4,5,6,7 can be seen as a linear system for the unknowns ii,f3,p,a, once the other parameters 
have been determined. 

D 

Remark 2 The exceptional set Cn could be simplified to 

C'n = {C - ^A > 0} (32) 

Since the jump times and the jump sizes of the BDLP are independent, and the former have an 
exponential distribution it follows that tq, . . . , t„ is with probability one not constant, so P[u^ — 
{v^Y' > 0] = 1. Furthermore, for finite n we have P[^^ — £,nV^ < 0] > 0. For definiteness we 
put 9n — outside Cn ■ 



3.2 Consistency and asymptotic normality 

Let us investigate the consistency and asymptotic normality of the estimator from the previous 
section. 

Theorem 1 We have P{Cn) -^ 1 when n —^ oo and the estimator On is consistent on Cn, 
namely 

^n * fo 

on Cn as n —> oo. 

Proof: From |HP07| Lemma 4] it follows that for all integers p,q,r > we have 

n 

^J2xfr!rU^E[Xfr!r5] (33) 

as n ^- oo. Using this results it easily follows that 

C -^n^n^ Covin, to) >0, (34) 

SO P{Cn) — + 1 as n — + oo. Furthermore, from ([33]) it follows that the empirical moments in ((30|) 
and (PT|) converge to their theoretical counterparts, ^,\ — ^ ^* and u^ ^-4 u', where 

C^ = AC(M + ApC)+/3(ery + AC2), v'^^+V- 

e = 2er,Xp + AC(m + XpC) + K^V + AC'), 

^'^ = /32(2Ar7 - 2e?7 + A^XC)/\ + 2/3(2Ar/p - 2er]p 

+A\{p + XpO) + A{2r,Xp^ + a^C + A(m + ApC)') 

Plugging the limits into (|29p shows, after a short mechanical calculation, that the estimator is 
consistent. 

D 

In order to prove asymptotic normality, we use the general framework and results of |S099| . We 
extend the theory in the case of a bivariate Markov process. To apply |S0991 Theorem 2.8], 
requires to show that |S0991 Condition 2.6] is satisfied. 

Proposition 3 The Condition 2.6 of \S099^ is satisfied. 

Proof: For a concise vector notation we introduce 

^k — {Tk,TkTk-l,Tf.,Xk,XkTk-l,XkTk,X^.) , (36) 

and we write the estimating equations in the form 



(35) 



Glie)^Y.H-f{r,^i,d)], * = !,..., 7. (37) 

Looking at ((27|) we note that /'(i, 6) is a polynomial in i, namely 

r(i,0) = ^(/.,,fc(0)i^ (38) 



fc=0 



where the degree pi and the coefficients (j>i^k{0), which are smooth functions in 9, can be read off 
from (PT)) . Now the proof is completely analogous to that of [HP 071 Proposition 4] . 

D 
Finally, we have all the ingredients for proving the following result. 

Theorem 2 The estimator 

6* = (i/„,a„, A„,^„,/3„,cr„,p„) (39) 

is asymptotically normal, namely 

V^[k-Oo]^N{0,T), (40) 

as n —f oo, where 

T^Aier'TiAie)-')"^, T,, ^ E[Cov{E\,E{\to)] (41) 



AA(^) = E 



w/^^-'\ 



(42) 



d 
Remark 3 Looking at i38\) . we see that -ttjt P{tq, 0) is a polynomial in tq and thus we can find 

explicit expressions for the entries of A. Similar arguments allow us to obtain explicit expressions 
for T, see fHPOT^ . 

Proof: From |HP07|, Proposition 3] it follows that 

-^GniOo) ^ N{0,T), (43) 

as n — > oo, where 

T,, = E[CoviE\,E{\To)] (44) 

and (S*, i = 1, . . . , 7) is defined by (|55|) . Using the just obtained result and Proposition [31 the 
result follows directly from ||S099, Theorem2.8]. 

D 

4 A numerical illustration of the finite sample performance 
of the estimator 

4.1 Description of the model and its parameter values 

To illustrate the results from the previous sections numerically, we consider the P-OU model 
from Section 12.1.21 where the trading volume r has a stationary gamma distribution. The 
corresponding BDLP Z is a compound Poisson process with intensity v and jumps from the 
exponential distribution with parameter a. We use as time unit one year consisting of 250 trading 
days. The true parameters are 

^ = 6.17, a = 1.42, A = 177.95, /3 = -0.015, p = -0.00056, /i = 0.435, cr = 0.087. (45) 

The parameters imply that there are on average 4.4 jumps per day and the jumps in the BDLP 
and in the trading volume are exponentially distributed with mean and standard deviation 0.704. 



n = 2500 


l^n 


a« 


A„ 


Mean 
MSE 

MAE 


6.2145 (0.2552) 
0.0672 (0.1046) 
0.2016 (0.1629) 


1.435 (0.0588) 
0.0036 (0.0055) 
0.047 (0.0369) 


177.865 (8.9257) 

79.5956 (115.6766) 

7.0692 (5.4454) 


n = 8000 


l^n 


an 


A„ 


Mean 
MSE 

MAE 


6.1642 (0.1424) 
0.0203 (0.0283) 
0.1135 (0.0862) 


1.4186 (0.0329) 
0.0011 (0.0016) 
0.0259 (0.0203) 


177.1342 (5.208) 

27.7663 (39.1414) 

4.2191 (3.1584) 



Table 1: Estimated means, MSE and MAE for the parameters Vn 
standard deviations in brackets. The true values are v = 6.17, a - 



an, An and the corresponding 
-- 1.42, A = 177.95. 



n = 2500 


M« 


/?n 


CTn 


Pn 


Mean 
MSE 

MAE 


0.4402 (0.1849) 
0.0342 (0.0492) 
0.1483 (0.1105) 


0.0148 (0.053) 
0.0028 (0.0039) 
0.0428 (0.0313) 


0.0871 (0.0013) 

1.82-10-'' (2.33-10-'3) 

0.0011 (0.0008) 


-5.65-10-'' (1.43-10-'') 

2-10-* (3-10-«) 
1.12-10-4 (8.85-10-5) 


n = 8000 


^J■n 


Pn 


fn 


Pn 


Mean 
MSE 
MAE 


0.4388 (0.1002) 
0.01 (0.0138) 
0.08 (0.0604) 


0.0129 (0.0278) 
0.0008 (0.0011) 
0.0222 (0.0169) 


0.0872 (7.45 ■ lO"'') 
5.56 ■lO-'^ (8.35-10-'^) 
5.81-10-4 (4.68-10-*) 


-5.55-10-" (8.07-10-5) 

6.5-10-9 (1- 10-«) 

6.3-10-5 (5.04-10-5) 



Table 2: Estimated means, MSE and MAE for the parameters /^n , /3n , cr„ , p„ and the corre- 
sponding standard deviations in brackets. The true values are f3 — — 0.015, p = — 0.00056, /i = 
0.435,(7 = 0.087. 



The interpretation is, that typically every day 4 or 5 new pieces of information arrive and make 
the trading volume process jump. The stationary mean of the trading volume is 4.35, and of 
the variance is 0.033. Hence, if we define instantaneous volatility to be the square root of the 
variance, it will fluctuate around 18% in our example. The half-life of the autocorrelation of the 
variance process is about a day. 

In our example annual log returns have (unconditional) mean —6.5% and annual volatility 
18.2%. We will perform the estimation procedure for two different sample sizes, namely 2500 
and 8000, corresponding to 10 years and 32 years respectively, with 250 daily observation per 
year. 

4.2 Simulation study 

We first simulate 1000 samples of n = 2500 equidistant observations of Xi and r^, i = 1, . . . , n. 
Table 1 summarizes the estimation results of our simulation study concerning the parameters 
i^,a,X,fi,/3,a,p. 

Figure [T] displays a simulation of ten years of daily observations from the background driving 
Levy process, the instantaneous trading volume process, the volatility process and log returns for 
i = 1 , . . . , 2500. The empirical mean of all the estimated parameter values Vn , a„ , A„ , /i„ , /3„ , an , p„ 
is shown in the first line, with the empirical standard deviations in brackets. We also estimated 
mean square error (MSE) and mean absolute error (MAE), again with the standard deviation in 
brackets. The corresponding results for a sample size of n = 8000 observations are reported in 
the last three lines of Table [T] and Table [H 
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When one compares the estimates for the different sample sizes, it can be seen that the MSE 
reduces for all seven estimators, when the sample size is increased and the reduction is roughly 
of a factor of 4 which would correspond to the asymptotic properties of the estimators. 



4.3 The asymptotic covariance matrix and the finite sample distribu- 
tion of the estimator 

As our goal is an analysis of the estimator, we do not estimate the asymptotic covariance, but 
evaluate the explicit expression using the true parameters. Denoting the vector of asymptotic 
standard deviations of the estimates and the correlation matrix by s/^/n resp. r we have 

(46) 



s = [12.0257, 2.7878, 443.85, 9.0211, 2.5536, 0.0657, 0.007]^ 



1 


0.938 


0.5778 


0.0074 


0.0511 


0.0062 


-0.0026 


0.938 


1 


0.5738 


0.0076 


0.0507 


0.0126 


-0.0039 


0.5778 


0.5738 


1 


0.011 


0.0884 


-0.00056 


-6.2 10- 


0.0074 


0.0076 


0.011 


1 


-0.8265 


-0.0128 


0.0296 


0.0511 


0.0507 


0.0884 


-0.8265 


1 


0.012 


-0.5148 


0.0062 


0.0126 


-0.00056 


-0.0128 


0.012 


1 


-0.0045 


-0.0026 


-0.0039 


-6.2- 10-1^ 


0.0296 


-0.5148 


-0.0045 


1 



(47) 



We will discuss what this values of s implies for the sample size of 2500 below. The corre- 
lations among parameters related to the returns distribution, namely /i, cr, p and j3, are rather 
small except for (3 and p. In contrast to that, correlations among the trading volume parameters, 
namely z^, a and A are very high. Theoretically, this can be addressed using the optimal martin- 
gale estimating function approach, even though the corresponding equations can not be solved 
explicitly and the optimal estimator has to be obtained by numerical optimization, see JHP06| 
for developments in this direction. 

Figure [2] illustrates the empirical distribution of the simple estimators for the F-OU model. 
The histograms are produced from m = 1000 replications consisting of n = 2500 observations 
each, corresponding to 10 years with 250 daily observations per year. Both from the graphs and 
the asymptotic standard deviations we see that the parameter u can be estimated quite accurately 
to at least one digit of precision. The parameter a is estimated even better with almost two digits 
of precision. The autocorrelation parameter A is estimated slightly less accurate. The parameter 
a which connects the trading volume/number of trades and volatility is estimated quite accurate 
with one to two digits of precision. This means that if the relation between volatility and trading 
volume is exploited, not too much uncertainty is introduced by the estimating procedure. This 
can be also very promising for option pricing purposes. The bad quality of the estimator for (3 
is neither surprising nor very troublesome. It has little impact on the model. The main reason 
for including the parameter /3 in the specification of BNS models is, for derivatives pricing: A 
risk-neutral BNS-model must have (3 = —1/2. In most applications working under a physical 
probability measure /? = can be assumed without much loss of generality or flexibility. For the 
same reason the parameter p, is not very relevant although it can be estimated more accurate 
than (3. Even though the value of the leverage parameter p is rather small, it can be estimated 
very accurately. 



4.4 Estimation of the volatihty ^/Vt 

Recall from ^ , that we assumed that the instantaneous variance is a multiple of trading volume 
and thus we have the following equation for the volatility 



0\fT\, 



i G 



(48) 
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n = 2500 


mean(e) 


Std(e) 


skcw(e) 


kurt(£) 


Mean 
MSE 
MAE 


0.11753 (0.03455) 

0.015(0.0086) 
0.11753 (0.03455) 


1.02865 (0.0067) 
0.00087 (0.00039) 
0.02865 (0.0067) 


-0.04114 (0.06107) 
0.00542 (0.00753) 
0.05885 (0.04423) 


3.3018 (0.17843) 
0.12289 (0.14759) 
0.30291 (0.17654) 


n = 8000 


mean(£) 


Std(e) 


skew(e) 


kurt(e) 


Mean 

MSE 
MAE 


0.1166 (0.01951) 
0.01398 (0.00456) 
0.1166 (0.01951) 


1.02838 (0.00373) 
0.00082 (0.00021) 
0.02838 (0.00373) 


-0.03876 (0.03476) 
0.00271 (0.00333) 
0.04335 (0.02883) 


3.29937 (0.10449) 
0.10053 (0.07529) 
0.299938 (0.10446) 



Table 3: Estimated mean, MSE and MAE for the mean, standard deviation, skewness and 
kurtosis of the residuals with corresponding estimated standard deviations in brackets. 



Since we observe r^ at integer times, an estimate of the volatility process a^/ri can therefore be 
calculated from (P5|) and it is plotted in Figure [3] together with the exact volatility (<7y^)i>i 
for one simulated path. The estimator for a is calculated from the simulated path and since a 
is very accurate, the two graphs are almost indistinguishable. 

Next we investigate the goodness of fit of our estimation method by a residual analysis. Recall 
from pT|) . the estimated residuals are given by 

{X, - AA - PY, - pZ,) /<70^, I e N (49) 

where the integrated instantaneous trading volume Yi is given by (fT9|) . The quantity Zi is not 
observable, but we can find an approximation Zi as follows. For the integral we use simple Euler 
approximations 



Y,, = I T{s-)ds 
Iti-l 



Ti A and 



^f^dW{s) « VtTa e. 



i e N 



(50) 



ti-i 



with Ei are approximately A^(0, 1) and i.i.d. The estimated residuals are given by 



ii = {X, -fiA- PnA - pZi)/(Ty/T,-A 

where 

Z, = (XA + 1)t, ~ n_i. (51) 

Since we assume that {Wi)i>o is an iid A^(0, 1) sequence, our goal is to check if the residuals 
e(-) are iid and A^(0, 1). The residuals should be symmetric around zero and thus their mean 
and skewness should be close to zero. Furthermore, we expect the kurtosis to be close to three. 
Consequently, we estimated mean, MSE, MAE and the corresponding standard deviations for 
the mean, the standard deviation, the skewness and the kurtosis of the residuals ii based on 1000 
simulations. The results for both sample sizes are reported in Tableland indicate a reasonable 
fit. The correlation of the squared residuals was checked by performing a Ljung-Box test for each 
sample. For n — 2500 we computed the test statistic based on 50 = -\/2500 lags and had to reject 
the null hypothesis of no correlation 60 times out of 1000 simulations at the 0.05 level. Whereas 
for n = 8000 the test statistic was computed using 90 « VSOOO lags and the null hypothesis was 
rejected 66 times out of 1000 simulations again at the 0.05 level. 

5 Real data analysis 

The BNS model will be fitted to daily log returns of the International Business Machines Cor- 
poration (IBM) stock and the Microsoft Corporation (MSFT) stock. The IBM stock is from the 
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Parameter 


Value IBM 


St.dev. 


j> 


6.17 


0.339 


a 


1.42 


0.079 


A 


177.95 


12.509 


A 


0.435 


0.254 


/3 


-0.015 


0.072 


tj 


0.087 


0.002 


P 


-0.00056 


0.0002 



Parameter 


Value MSFT 


St.dev. 





4.496 


0.247 


a 


67.895 


3.773 


A 


201.99 


14.420 


A 


0.4162 


0.265 


/5 


-0.464 


5.059 


(7 


0.81 


0.018 


P 


-0.025 


0.013 



Table 4: Estimated parameter values. 



New York Stock Exchange (NYSE), whereas MSFT belongs to Nasdaq. The data spans over 
roughly 5 years starting in March 23, 2003 to March 23, 2008 for IBM and starting in April 11, 
2003 to February 4, 2008 for MSFT. There were 1259 and 1212 observations for IBM and MSFT 
respectively of daily closing prices and trading volumes. The resulting time series are shown in 
Figure [Hand Figure El Data on trading volumes are expressed in millions. Sample measures of 
skewness and kurtosiqll of the returns are —0.35 and 7.42 respectively for IBM and —0.68 and 
14.2 for MSFT. Throughout, we consider the F-OU model from section [5X1 

5.1 Parameter estimates and interpretation 

Table [D presents the estimated parameter values for the IBM and MSFT stocks. In the IBM 
case, for example, the parameters imply that there are on average 4.4 jumps per day each with 
mean and standard deviation 0.704. Typical volatility is 0.18 with standard deviation 0.11. The 
proportionality of trading volume and the instantaneous variance is given by cr^ = 0.0076. The 
leverage p is very small. 

5.2 Returns distribution 

The estimated parameters in the IBM case, for example, imply that the mean of daily log-returns 
including or not a leverage effect in the model equal —0.027%, and 0.146% respectively. If the 
trading volume process jumps by a typical size, the returns jump by 0.0004. Using the estimated 
parameters, the volatility processes for the IBM and MSFT stocks are shown in Figure [H In Ta- 
ble [5] some results on the marginal moments of daily log returns and the instantaneous variance 
process V{t) using the estimated model parameters are presented. Furthermore, the theoretical 
density and log density of log returns and the estimated ones are shown in Figure [T5] and Fig- 
ure [THl A systematic method how to calculate the theoretical density of log returns is given in 
Appendix [SI 

5.3 The autocorrelation function 

The theoretical autocorrelation function for the variance process and the estimated autocorre- 
lation for both the stocks are shown in Figure [7| which is not very satisfactory. We will address 
this issue in the concluding remarks below. 



^Skewness is measured as fJ,zl \^ IJ.\, and kurtosis as /X4/^|, where fii is the ith central moment. 
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Unconditional moments IBM 


Value 


EX] 


-0.027% 


St.dev[X] 


1.15% 


E[V] 


3.3% 


St.dev[V] 


1.32% 



Unconditional moments MSFT 


Value 


EX] 


0.02% 


St.dev[X] 


1.32% 


E[V] 


4.34% 


St.dev[V] 


2.05% 



Table 5: Unconditional moments calculated from the estimated parameters. 





mean(£) 


std(e) 


skew(£) 


kurt(£) 


IBM 


-0.01568 


1.03142 


-0.17053 


5.40659 


MSFT 


-0.01375 


1.00386 


-0.35846 


7.86258 



Table 6: Mean, standard deviation, skewness and kurtosis of the IBM residuals. 



5.4 The model fit 

To investigate the model fit, we performed a Ljung-Box test for squared residuals of the data set. 
The test statistic used 35 lags of the corresponding empirical autocorrelation function. The null 
hypothesis was not rejected for MSFT at the 0.05 level. For the MSFT squared residuals the 
p-value was 0.052 The test statistic for the IBM squared residuals was equal to 451.61, which led 
to a rejection of the null hypothesis, since the test had a critical value of 113.15 at the 0.05 level. 
This result is also obvious from Figure [S] where the empirical autocorrelation function of the 
squared residuals is plotted, showing significant correlations of the IBM residuals. The empirical 
autocorrelation functions of the squared residuals for MSFT is shown in Figure [9l Furthermore, 
the autocorrelation fmictions of £(•) for both the stocks are shown in Figure [HI 

The estimated mean, standard deviation, skewness and kurtosis of the residuals for both 
the stocks are summarized in Table [HI The numbers show that the mean and variation of the 
residuals are according to our model, but the residuals seem to have heavier tails than the normal 
distribution, see Figure fTOl and Figure fTTI The IBM residuals pass the Kolmogorov-Smirnov test 
of normalitjo, for example, with p- value 0.0886, whereas the test statistic for the MSFT residuals 
was equal to 0.0622, which lead to a rejection of the null hypothesis, since the test had a critical 
value of 0.0389 at the 0.05 level. Log returns for the IBM and MSFT stocks and their residuals 
are shown in Figure [TH and Figure 1131 

6 Conclusion and further developments 



We introduce a new variant of the Barndorff-Nielsen and Shephard stochastic volatility model 
where the non Gaussian Ornstein-Uhlenbeck process describes some measure of trading intensity 
like trading volume or number of trades instead of unobservable instantaneous variance. 

This allows us to implement a martingale estimating function approach and obtain an explicit 
consistent and asymptotically normal estimator. We first perform the numerical finite sample 
experiment to assess the quality of our procedure and then apply the obtained results to real 
stock data. 



^In |Lin07| a similar approach is used, with number of trades as a measure of trading intensity. In that 
work, superposition of two OU-processes are analyzed for the modelling procedure, but without a leverage in the 
specification of returns. It is also pointed out that typically their approach gives normalized returns with heavier 
tails than the normal distribution for illiquid stocks and for special dates such as the trading day before a holiday. 
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According to the residual analysis and to the return distribution, the model fit is in many 
aspects quite satisfactory except for the autocorrelation function of the trading volume. The 
graph indicates that superposition of OU-processes could be used for a more accurate description 
of the autocorrelation function, see also [Li nO?) IGS06| . but it is not clear how to extend the 
martingale estimating function approach in this direction. This is left open for future research. 

In this paper the empirical analysis uses trading volume. It would be interesting to compare 
the results to a similar analysis using number of trades as suggested by [Lin07J . 

The present analysis was performed for daily data, but the approach applies for any sampling 
frequency since it is based on the continuous time specification of the Barndorff-Nielsen and 
Shephard model. In particular, the approach could be applied directly to high frequency data. 

Further and alternative developments like optimal quadratic estimating functions, use of 
trigonometric moments and comparison to the generalized method of moments suggested in 
|HP07| apply also to the present framework. 

A Some numerical and analytical aspects of the density 
function of log returns 

In this section we compute and analyze in detail the distribution and the density function of log 
returns (Xi)i>i. By stationarity, it is sufficient to show the results for Xi. The main tool for 
the computation is the well-known key formula, see for example |NV03[ IER99| . 

A.l Cumulant of (Zi, Ui) 

It is convenient to introduce the bivariate cumulant function 

kzi,Ui{hi,h2) = K[hi,h2tZi,Ui]. 
Using relation (fTTjl and the key formula it easily follows that 



.A 
kz,.Ui{hi,hi) ^logE[exp{hiZi + h2Ui}] =A / fczi(/ii + /12 exp{-A(A 

Jo 



s)})ds, 



where kz^ is the cumulant function of Zi. Moreover, kz^ is explicit and related with any self- 
decomposable law through the well-know formula given in jBNSOlj . 

Remark 4 In the two concrete specifications given in sections \2.1.2\ and \2.1.3\ , namely the 
T-OU and the IG-OU, the cumulant function of Zi is 

kzAh) = ^^ and kzAh) ^ h6h^ - 2h)-^^^ 
a — h 

respectively. 

A. 2 Cumulant of {Yi,Zi) 

It is convenient to introduce the bivariate cumulant function 

kY,Mhi,h2) ^ K[hi,h2tYi,Zi]. 
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Using relations (fTU|) . (|13p . the key formula and the independency of Z\ and tq, it follows that 
fcFi,Zi(^i,/i2) = log£;[exp{/iiYi + ft,2^i}] 

= kr„ihi ■ ex{A)) + \ [ kzAhi-exiA~s) + h2)ds, (52) 

where fc^o is the cumulant function of the trading volume/number of trades process. 

A. 3 Cumulant of Xi 

Finally, we are able to calculate the cumulant function of log returns according to the obtained 
expressions of bivariate cumulants in sections lA. II and lA. 21 Furthermore, using relation (pij) and 
the key formula it follows that 

kxAh) = \ogE[exp{hXi}] 

= \ogE{E[eii-p{hfiA + h(3Yi + hay/y\Wi + hpZi}\Zi,Yi] } 

= log£;[exp{VA + /3/iYi + phZx + — — ^}] (53) 

= hnA + kz^,Yt{l3h~\ —,ph) 

= hfiA + kr,{{ph+^)-ex{A))+X f kz,{{f3h+^)-ex{A~s)+ph)ds. 
^ Jo ^ 

Since in the F-OU case we have that 

kro (h) = v log and kz^ Qi) = , 

a — h a — 11 

integrating out the cumulant function of Zi , it follows that 
kx^ih) = huA + vlog- 7777777— ^3X?T + ^ / kz,{{(ih+^—-)-ex{A^s)+ph)ds 



a-ex{A){(3h+^] 

) 

aX - ^A/t(2/3eA(A) + 2p + ex{A)a^h) 



2a 



\v< hA{2l3 + 2Xp + (7^h) + 2alog 
+ ^^ 



A(q; — ph) 



2a\-h{2l3 + 2Xp + a^h) 

Furthermore, the density function of log returns will be calculated by Laplace inversion. Denoting 
the density function of log returns by fx{x), we have 



1 f°° 

- Re{exp{kxAw)-ixy})dy, 

■^ Jo 



fxi{x)^-\ Re{exp{kxAiy) - ixy})dy, (55) 

■^ Jo 

where Re{-) denotes the real part of a complex number. The numerical integration is performed 
in MATLAB using the function quadgk. 
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Figure 1: Instantaneous BDLP Z, number of trades/trading volume r^, the volatility process 
c X yjrl and daily log returns X^. 
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Figure 2: Simulated distribution of the simple estimators for the F-OU model. The histograms 
are produced from m = 1000 replications consisting of n = 2500 observations each. The true 
values are v = 6.17, a = 1.42, A = 177.95, /i = 0.435, /3 = -0.015, p = -0.00056, a = 0.087. The 
standard deviations used for the normal curves are taken from the explicit asymptotic results, 
not estimated. 
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Figure 3: Sample paths of (J^frl (solid line) and a^frl (dashed line) of one simulation. 
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Figure 4: Left: Closing prices for the IBM stock from March 23, 2003 to March 23, 2008. Right: 
Trading volumes for the IBM stock during the same time period. 
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Figure 5: Left: Closing prices for the MSFT stock from April 11, 2003 to February 4, 2008. 
Right: Trading volumes for the MSFT stock during the same time period. 
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Figure 6: The estimated volatility process ax^/r for IBM (top) and MSFT {bottom). 
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Figure 7: The autocorrelation function for the variance process and the estimated theoretical 
autocorrelation for IBM (left) and for MSFT (right). 
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Figure 8: Left: The empirical autocorrelation function for the squared log returns for IBM during 
the period March 23, 2003 to March 23, 2008. Right: The empirical autocorrelation function for 
the squared residuals for IBM during the same period. The straight lines are the asymptotic 
95% confidence bands ±1.96y^, where n is the number of observations. 
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Figure 9: Left: The empirical autocorrelation function for the squared log returns for MSFT 
during the period April 11, 2003 to February 4, 2008. Right: The empirical autocorrelation 
function for the squared residuals for MSFT during the same period. The straight lines are the 
asymptotic 95% confidence bands ±1.96-\/n, where n is the number of observations. 





Normal Probability Piot 




0.999 
0.997 

0.99 
0.98 




/ ■ ^ 
+ " 


0.95 




/ 


0.90 


J 


f 


>, 0.75 


/ 


- 


|0.50 
°- 0.25 


/ 


' 


0.10 


/ 


- 


0.05 


/ 


- 


0.02 
0.01 

0.003 
0.001 


+ . . / . 


- 









Normai Probabiiity Plot 










II 


' '. + 


0.999 








+ 


0.997 
0.99 








y 


0.98 








¥ 


0.95 








- 


0.90 








- 


* 0.75 








- 


1 0.50 








- 


o 




















"- 0.25 








- 


0.10 








- 


0.05 








- 


0.02 






jfW / 


- 


0.01 






/ ' 


- 


0.003 






+.++' ■/ . 


- 


0.001 


+ 




i 


' 



-0.06 -0.04 -0.02 0.02 0.04 

Data 



-4-2024 
Data 



Figure 10: Left: The normal probability plot of log returns for IBM during the period March 
23, 2003 to March 23, 2008. Right: The normal probability plot of residuals for IBM during the 
same period. 
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Figure 11: Le]t: The normal probability plot of log returns for MSFT during the period April 
11, 2003 to February 4, 2008. Right: The normal probabihty plot of residuals for MSFT during 
the same period. 
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Figure 12: Left: Log returns for IBM during the period March 23, 2003 to March 23, 2008. 
Right: The residuals for IBM during the same time period. 



24 





2004 2005 2006 2007 



Figure 13: Left: Log returns for MSFT during the period April 11, 2003 to February 4, 2008. 
Right: The residuals for MSFT during the same time period. 





Figure 14: The autocorrelation function for the residuals for IBM (left) during the period March 
23, 2003 to March 23, 2008 and MSFT {right) during the period April, 11, 2003 to February 4, 
2008 . 
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Figure 15: Theoretical densities (dashed hne) and kernel estimates of the empirical ones (solid 
line) of log returns for IBM (left) and MSFT (right). 
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Figure 16: Theoretical log densities (dashed line) and kernel estimates of the empirical ones 
(sohd line) of log returns for IBM {left) and MSFT [right). 
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